Czech audio-visual speech corpus of a car driver for in-vehicle audio-visual speech recognition

Authors

Milos Zelezný

Petr Císar

Abstract

This paper presents the design of an audio-visual speech corpus for in-vehicle audio-visual speech recognition. Several audio-visual speech corpora exist worldwide, as do several audio-only speech corpora for in-vehicle recognition. So far, however, we have found neither an audio-visual speech corpus for in-vehicle speech recognition nor any audio-visual speech corpus for the Czech language. Since our aim is to design an audio-visual speech recognizer for in-vehicle use, the first step was to design, collect, and process a Czech in-vehicle audio-visual speech corpus. In-vehicle speech recognition is usually used for command control of car features that does not involve the driver's hands. Thus, in real deployment, it is the driver whose speech will be recognized. Although this is more demanding than collecting the speech of a passenger, we decided to collect the driver's speech for training purposes. This is probably less important for an audio-only speech corpus, but for our purpose we need to collect speech in real conditions, i.e., conditions that include the head movements caused by the driver having to pay attention to the traffic situation.

Similar resources

3D Lip-tracking for Audio-visual Speech Recognition in Real Applications

In this paper, we present a solution to the problem of tracking 3D information about the shape of the lips from a 2D picture of a speaker. We focus on lip-tracking in audio-visual speech recordings from the Czech in-vehicle audio-visual speech corpus (CIVAVC). The corpus consists of 4 h 40 min of audio-visual speech recordings of a driver, recorded in a car while driving in ordinary traffic. In real condit...

AV@CAR: A Spanish Multichannel Multimodal Corpus for In-Vehicle Automatic Audio-Visual Speech Recognition

This paper describes the acquisition of the multichannel multimodal database AV@CAR for automatic audio-visual speech recognition in cars. Automatic speech recognition (ASR) plays an important role inside vehicles in keeping the driver from being distracted. It is also known that visual information (lip-reading) can improve ASR accuracy under adverse conditions such as those within a car. The corpus...

Design and recording of Czech speech corpus for audio-visual continuous speech recognition

In this paper we describe the design, recording, and content of a large audio-visual speech database intended for training and testing audio-visual continuous speech recognition systems. The UWB05-HSCAVC database contains high-resolution video and high-quality audio data suitable for experiments on audio-visual speech recognition. The corpus consists of nearly 40 hours of audio-visual recordings of 1...

Design and Recording of Czech Audio-Visual Database with Impaired Conditions for Continuous Speech Recognition

In this paper we discuss the design, acquisition, and preprocessing of a Czech audio-visual speech corpus. The corpus is intended for training and testing an existing audio-visual speech recognition system. The name of the database is UWB-07-ICAVR, where ICAVR stands for Impaired Condition Audio Visual speech Recognition. The corpus consists of 10000 utterances of continuous speech obtained from ...

Innovations in Czech audio-visual speech synthesis for precise articulation

This paper presents new steps toward the animation of precise articulation. An audio-visual corpus for Czech was acquired and a new method for the parameterization of visual speech was designed to obtain exact speech data. The parameterization method is primarily suitable for training data-driven visual speech synthesis systems. The audio-visual corpus also includes a specially designed test part. Fur...

Publication date: 2003
